From speech corpus to intonation corpus: clustering phrase pitch contours of Lithuanian

Authors

  • Gailius Raskinis
  • Asta Kazlauskiene
Abstract

This paper presents research carried out in preparation for compiling a Lithuanian intonation corpus. The main objective was to discover characteristic patterns of Lithuanian intonation by clustering the pitch contours of intermediate intonation phrases. The paper covers the set of procedures used to extend an ordinary speech corpus so that it becomes suitable for intonation analysis. The intonation analysis included pitch extraction, pitch normalization, estimation of the representative frequency of a syllable, measurement of inter-phrase similarity, k-means phrase clustering, and visualisation of the clustering results. These computational procedures were applied to 23 hours of read speech containing 41417 phrases. The clustering results revealed intonation patterns of Lithuanian that could be related to well-known linguistic-prosodic phenomena. Language independence is an important feature of the computational procedures covered by this paper: given speech waveforms and knowledge of phone and phrase boundaries, the same procedures can be used to analyse the intonation of other languages.
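
The processing chain named in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how each step was carried out, so every concrete choice below is an assumption made for illustration only: the median of voiced frames as the representative frequency of a syllable, semitones relative to the phrase median as the normalization, linear resampling to a fixed number of points so that phrases of different lengths can be compared, Euclidean distance as the inter-phrase similarity, and a deterministic farthest-point initialization for k-means. All function names and the toy pitch values are hypothetical.

```python
import math
from statistics import median

def to_semitones(f0_hz, ref_hz):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)

def syllable_pitch(frames_hz):
    """Assumed representative frequency: median of voiced frames (0 = unvoiced)."""
    voiced = [f for f in frames_hz if f > 0]
    return median(voiced) if voiced else None

def resample(values, n_points):
    """Linearly interpolate values onto n_points equally spaced positions."""
    if len(values) == 1:
        return [values[0]] * n_points
    out = []
    for i in range(n_points):
        pos = i * (len(values) - 1) / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def phrase_contour(syllables_hz, n_points=5):
    """Per-syllable frame lists -> fixed-length contour in semitones."""
    pitches = [p for p in map(syllable_pitch, syllables_hz) if p is not None]
    if not pitches:
        raise ValueError("phrase has no voiced syllables")
    ref = median(pitches)  # assumed normalization: relative to the phrase median
    return resample([to_semitones(p, ref) for p in pitches], n_points)

def distance(a, b):
    """Assumed inter-phrase dissimilarity: Euclidean distance between contours."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(vectors, k, iterations=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(vectors, key=lambda v: min(distance(v, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: distance(v, centers[j])) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Toy data: each phrase is a list of syllables, each syllable a list of F0 frames in Hz.
rising = [[[100, 102], [110, 0, 112], [125, 127]],
          [[180, 182], [200, 198], [230, 0]]]
falling = [[[130, 128], [112, 110], [100, 0]],
           [[240, 238], [205, 0], [180, 178]]]

contours = [phrase_contour(p) for p in rising + falling]
labels, centers = kmeans(contours, k=2)
print(labels)
```

On this toy input the two rising phrases and the two falling phrases end up in separate clusters even though the speakers' pitch ranges differ, which is the purpose of the normalization step. A real pipeline would obtain the F0 frames from a pitch tracker and the syllable and phrase boundaries from corpus annotation, neither of which is shown here.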

Similar resources

Clustering of foot-based pitch contours in expressive speech

Intonation generation is still one of the weak links in the text-to-speech synthesis chain. It is a hard enough task to generate expressively neutral pitch contours, with accurate placement of accents and phrase boundaries, but to generate appropriate intonation for expressive speech is even more of a challenge. This paper is a first attempt at describing and categorizing the variation in pitch ...

Prosody annotation for corpus based speech synthesis

The paper concerns prosody annotation, especially for application in corpus-based speech synthesis. In order to establish the rules of automatic intonation modelling, a phonetically labeled speech database of 4 hours has been perceptually and acoustically analyzed. The speech material included different text types and prosodically rich phrases. The annotation of the speech database consists in p...

The stylization of intonation contours

This paper presents the stylization of intonation contours and clustering of F0 movements on accented and post-accented syllables based on annotated speech corpora. Special software, PitchLine, has been developed to enable flexible quasi-automatic segmentation and parametrization of intonation curves. The experimental material obtained from a 15 min passage read by a male speaker included ...

Using Zero-Frequency Resonator to Extract Multilingual Intonation Structure

Humans use expressive intonation to convey linguistic and paralinguistic meaning, in particular using focal prominence to emphasize the focus of speech. Automatic extraction of dynamic intonation features from a speech corpus, and their representation in a continuous form, are desired in multilingual speech synthesis. This paper presents a method to extract dynamic prosodic structure ...

Corpus-Based Hidden Markov Modelling of the Fundamental Frequency of Lithuanian

This paper presents a corpus-driven approach to building a computational model of fundamental frequency, or F0, for the Lithuanian language. The model was obtained by training the HMM-based speech synthesis system HTS on six hours of speech coming from multiple speakers. Several gender-specific models, using different parameters and different contextual factors, were investigated. The models we...

Publication date: 2013